Abstract: Frequent item set mining (FIM) is one of the most fundamental problems in data mining. In this paper, a differentially private FIM algorithm is given which can not only achieve high data utility and a high degree of privacy, but also offer high time efficiency. A differentially private FIM algorithm is based on the FP-growth algorithm, which is referred to as PFP-growth. The PFP-growth algorithm consists of two main phases, preprocessing phase and mining phase. In the preprocessing phase, a smart splitting method is used to transform the database. In the mining phase, to cover the information loss caused by transaction splitting, we used a run-time estimation method to estimate the actual support of item sets in the original database. Through formal privacy analysis, we show that our PFP-growth algorithm is differentially private. After this one-to-many data linkage method is used to link different item sets and this method is based on One Class Clustering Tree (OCCT).

Keywords: Frequent Item set Mining, Differential Privacy, Smart Splitting, OCCT, Data Linkage, and Clustering Tree.